A. PROBLEM SETTING

Focus  here  is  on  stochastic  search  and  optimization: 

A.  Random  noise  in  input  information  (e.g.,  noisy 
measurements  of  loss  function) 

—  and/or  — 

B. Injected randomness (Monte Carlo) in choice of
algorithm iteration magnitude/direction

Contrasts with deterministic methods

- E.g., steepest descent, Newton-Raphson, etc.

- Assume perfect information about $L(\theta)$ (and its gradient)

- Search magnitude/direction deterministic at each iteration

Injected randomness (B) in search magnitude/direction can
offer benefits in efficiency and robustness

- E.g., capabilities for global (vs. local) optimization


Some  Popular  Stochastic  Search  and 
Optimization  Techniques 

Random  search 
Stochastic  approximation 

—  Robbins-Monro  and  Kiefer-Wolfowitz 

—  SPSA 

—  NN  backpropagation 

—  Infinitesimal  perturbation  analysis 

—  Recursive  least  squares 

—  Many  others 
Simulated  annealing 

Evolutionary  computation  and  genetic  algorithms 

Reinforcement  learning 

Markov  chain  Monte  Carlo  (MCMC) 

Etc. 


Baseline  Problem  Setting  for  SPSA 

Algorithm 

Consider standard minimization setting, i.e., find root $\theta^*$ to

$$g(\theta) \equiv \frac{\partial L}{\partial \theta} = 0$$

where $L(\theta)$ is scalar-valued loss function to be minimized
and $\theta$ is p-dimensional vector

Assume only (possibly noisy) measurements of $L(\theta)$
available

- No direct measurements of $g(\theta)$ used, as are required in
stochastic gradient methods

Noisy measurements of $L(\theta)$ arise in areas such as Monte Carlo
simulation, real-time control/estimation, etc.

Interested in $p > 1$ setting (including $p \gg 1$)


B.  SPSA  ALGORITHM 


Let $\hat{g}_k(\theta)$ denote SP estimate of $g(\theta)$ at kth iteration
Let $\hat{\theta}_k$ denote estimate for $\theta^*$ at kth iteration
SPSA algorithm has form

$$\hat{\theta}_{k+1} = \hat{\theta}_k - a_k \hat{g}_k(\hat{\theta}_k)$$

where $\{a_k\}$ is nonnegative gain sequence

Generic iterative form above is standard in SA; stochastic
analogue to steepest descent

Under conditions, $\hat{\theta}_k \to \theta^*$ in some stochastic sense as
$k \to \infty$


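Computation of $\hat{g}_k(\cdot)$ (Heart of SPSA)

• Let $\Delta_k$ be vector of p independent random variables at kth
iteration

$$\Delta_k = [\Delta_{k1}, \Delta_{k2}, \ldots, \Delta_{kp}]^T$$

• $\Delta_k$ typically generated by Monte Carlo

• Let $\{c_k\}$ be sequence of positive scalars

• For iteration $k \to k+1$, take measurements at design
levels $\hat{\theta}_k \pm c_k \Delta_k$:

$$y(\hat{\theta}_k + c_k \Delta_k) = L(\hat{\theta}_k + c_k \Delta_k) + \varepsilon_k^{(+)}$$
$$y(\hat{\theta}_k - c_k \Delta_k) = L(\hat{\theta}_k - c_k \Delta_k) + \varepsilon_k^{(-)}$$

where $\varepsilon_k^{(\pm)}$ are measurement noise terms

• Common special case is when $\varepsilon_k^{(\pm)} = 0 \;\forall k$

(e.g., system identification with perfect measurements
of the likelihood function)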


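Computation of $\hat{g}_k(\cdot)$ (cont’d)

• The standard SP form for $\hat{g}_k(\cdot)$:

$$\hat{g}_k(\hat{\theta}_k) = \begin{bmatrix} \dfrac{y(\hat{\theta}_k + c_k \Delta_k) - y(\hat{\theta}_k - c_k \Delta_k)}{2 c_k \Delta_{k1}} \\ \vdots \\ \dfrac{y(\hat{\theta}_k + c_k \Delta_k) - y(\hat{\theta}_k - c_k \Delta_k)}{2 c_k \Delta_{kp}} \end{bmatrix}$$

• Note that $\hat{g}_k(\cdot)$ only requires two measurements of $L(\cdot)$
independent of p

• Above SP form contrasts with standard finite-difference
approximations taking 2p (or p+1) measurements

• Intuitive reason why $\hat{g}_k(\cdot)$ is appropriate is that
$E[\hat{g}_k(\hat{\theta}_k) \mid \hat{\theta}_k] \approx g(\hat{\theta}_k)$; formalized in Section C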
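The measurement counts quoted above (two for the SP form versus 2p for two-sided finite differences) can be checked with a short Python sketch. It is only an illustration under stated assumptions: the `CountingLoss` class, the noise-free quadratic loss, and the function names are hypothetical and do not come from the slides; the SP formula itself is the same one used in the MATLAB listing of Section D.

```python
import random


class CountingLoss:
    """Hypothetical noise-free loss L(theta) = sum(theta_i^2) that counts its calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, theta):
        self.calls += 1
        return sum(t * t for t in theta)


def sp_gradient(y, theta, c, rng):
    """Simultaneous perturbation estimate: two calls to y for any dimension p."""
    delta = [rng.choice((-1.0, 1.0)) for _ in theta]  # Bernoulli +/-1 components
    y_plus = y([t + c * d for t, d in zip(theta, delta)])
    y_minus = y([t - c * d for t, d in zip(theta, delta)])
    return [(y_plus - y_minus) / (2.0 * c * d) for d in delta]


def fd_gradient(y, theta, c):
    """Two-sided finite-difference estimate: 2*p calls to y."""
    grad = []
    for i in range(len(theta)):
        plus, minus = list(theta), list(theta)
        plus[i] += c
        minus[i] -= c
        grad.append((y(plus) - y(minus)) / (2.0 * c))
    return grad


p = 10
theta = [1.0] * p
sp_loss, fd_loss = CountingLoss(), CountingLoss()
sp_gradient(sp_loss, theta, 0.1, random.Random(0))
fd_gradient(fd_loss, theta, 0.1)
print(sp_loss.calls, fd_loss.calls)  # 2 measurements versus 2*p = 20
```

A single SP estimate is much noisier than the finite-difference one; the point of the sketch is only the cost per gradient estimate.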


Essential  Conditions  for  SPSA 


• To use SPSA, there are regularity conditions on $L(\theta)$, choice
of $\Delta_k$, the gain sequences $\{a_k\}$, $\{c_k\}$, and the measurement
noise

- Sections 7.3 and 7.4 of ISSO present essential conditions

• Roughly speaking the conditions are:

A. $L(\theta)$ smoothness: $L(\theta)$ is thrice differentiable function
(can be relaxed; see Section 7.3 of ISSO)

B. Choice of $\Delta_k$ distribution: For all k, $\Delta_k$ has independent
components, symmetrically distributed around 0, and

$$E(\Delta_{ki}^{2}) < \infty, \quad E(\Delta_{ki}^{-2}) < \infty$$

- Bounded inverse moments condition is critical (excludes
$\Delta_{ki}$ being normally or uniformly distributed)

- Symmetric Bernoulli $\Delta_{ki} = \pm 1$ (prob = 1/2 for each outcome)
is allowed; asymptotically optimal (see Section G or Section
7.7 of ISSO)
Essential Conditions for SPSA (cont’d)

C. Gain sequences: standard SA conditions:

$$a_k, c_k > 0; \quad a_k, c_k \to 0 \text{ as } k \to \infty; \quad \sum_{k=0}^{\infty} a_k = \infty; \quad \sum_{k=0}^{\infty} \left( \frac{a_k}{c_k} \right)^2 < \infty$$

(better to violate some of these gain conditions in certain
practical problems; e.g., nonstationary tracking and
control where $a_k = a > 0$, $c_k = c > 0$ $\forall k$, i.e., constant gains)

D. Measurement Noise: Martingale difference

$$E[\varepsilon_k^{(+)} - \varepsilon_k^{(-)} \mid \hat{\theta}_k, \Delta_k] = 0$$

$\forall k$ sufficiently large. (Noises not required to be
independent of each other or of current/previous $\hat{\theta}_k$ and
$\Delta_k$ values.) Alternative condition (no martingale mean 0
assumption needed) is that $\varepsilon_k^{(\pm)}$ be bounded $\forall k$
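For power-law gains of the kind used in the MATLAB listing of Section D (a/(k+A)^alpha and c/k^gamma), the gain conditions in C reduce to simple checks on the two exponents. The Python sketch below is a hedged illustration: the function names are hypothetical, and the example exponents 0.602 and 0.101 are values commonly cited in the SPSA literature rather than values stated on these slides.

```python
def gain_sequences(a, A, alpha, c, gamma, n):
    """Return lists a_k = a/(k+A)^alpha and c_k = c/k^gamma for k = 1..n."""
    ak = [a / (k + A) ** alpha for k in range(1, n + 1)]
    ck = [c / k ** gamma for k in range(1, n + 1)]
    return ak, ck


def satisfies_sa_conditions(alpha, gamma):
    """Check power-law decay exponents against the standard SA gain conditions.

    a_k, c_k -> 0            needs alpha > 0 and gamma > 0
    sum a_k = infinity       needs alpha <= 1
    sum (a_k/c_k)^2 < inf    needs 2*(alpha - gamma) > 1
    """
    return alpha > 0 and gamma > 0 and alpha <= 1 and 2 * (alpha - gamma) > 1


print(satisfies_sa_conditions(0.602, 0.101))    # commonly cited practical exponents
print(satisfies_sa_conditions(1.0, 1.0 / 6.0))  # a_k = 1/k, c_k = c/k^(1/6)
print(satisfies_sa_conditions(0.0, 0.0))        # constant gains violate the conditions
```

The last call corresponds to the constant-gain case, which the slide notes can still be the better choice in nonstationary tracking and control even though it violates the formal conditions.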
Valid and Invalid Perturbation Distributions

[Figure: density shapes of candidate perturbation distributions. VALID: Bernoulli, segmented uniform, U-shaped. INVALID: normal, uniform, V-shaped.]
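A short worked calculation shows why the inverse-moment condition admits the symmetric Bernoulli distribution but excludes the uniform one. The choice of the interval (-1, 1) for the uniform case is an assumption made only for this illustration.

```latex
% Symmetric Bernoulli, \Delta = \pm 1 with probability 1/2 each:
E(\Delta^{-2}) = \tfrac{1}{2}\,(1)^{-2} + \tfrac{1}{2}\,(-1)^{-2} = 1 < \infty

% Uniform on (-1, 1), density 1/2:
E(\Delta^{-2}) = \int_{-1}^{1} \tfrac{1}{2}\, x^{-2}\, dx
             = \lim_{\epsilon \to 0^{+}} \int_{\epsilon}^{1} x^{-2}\, dx
             = \lim_{\epsilon \to 0^{+}} \left( \frac{1}{\epsilon} - 1 \right) = \infty
```

The same divergence occurs for any density that is continuous and positive at 0, which covers the normal case as well.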
C. THEORETICAL FOUNDATION
Three Questions

Question 1: Is $\hat{g}_k(\cdot)$ a valid estimator for $g(\cdot)$?
Answer: Yes, under modest conditions.

Question 2: Will the algorithm converge to $\theta^*$?
Answer: Yes, under reasonable conditions.

Question 3: Do savings in data/iteration lead to a
corresponding savings in converging to optimum?
Answer: Yes, under reasonable conditions.
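Near Unbiasedness of $\hat{g}_k(\cdot)$

SPSA stochastic analogue to deterministic algorithms if $\hat{g}_k(\theta)$
is “on average” same as $g(\theta)$ for any $\theta$

Suppressing iteration index k, mth component of $\hat{g}(\theta)$ is:

$$\hat{g}_m(\theta) = \frac{L(\theta + c\Delta) - L(\theta - c\Delta)}{2c\Delta_m} + \text{noise}$$

$$\approx \frac{\left[ L(\theta) + c\, g(\theta)^T \Delta \right] - \left[ L(\theta) - c\, g(\theta)^T \Delta \right]}{2c\Delta_m} + \text{noise}$$

$$= \frac{\sum_i g_i(\theta) \Delta_i}{\Delta_m} + \text{noise}$$

$$= g_m(\theta) + \sum_{i \neq m} g_i(\theta) \frac{\Delta_i}{\Delta_m} + \text{noise}$$

With $E(\Delta_i / \Delta_m) = 0$ for $i \neq m$ we have for any m:

$$E[\hat{g}_m(\theta)] \approx g_m(\theta)$$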
Illustration of Near-Unbiasedness for $\hat{g}_k(\cdot)$ with
p = 2 and Bernoulli Perturbations

[Figure: (a) True gradient $g(\hat{\theta}_k)$ and four possible sample points around $\hat{\theta}_k$. (b) Two possible search directions and magnitudes $\hat{g}_k(\hat{\theta}_k)$. (c) Mean of the two possible $\hat{g}_k(\hat{\theta}_k)$ values.]

$$E[\hat{g}_k(\hat{\theta}_k) \mid \hat{\theta}_k] \approx g(\hat{\theta}_k)$$
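The p = 2 Bernoulli picture can be reproduced numerically by enumerating the four equally likely perturbation vectors. The Python sketch below assumes a hypothetical noise-free quadratic loss chosen only for illustration; for a quadratic the averaging argument is exact, so the mean of the SP estimates equals the true gradient.

```python
from itertools import product


def loss(theta):
    # Hypothetical noise-free quadratic loss, chosen only for illustration.
    t1, t2 = theta
    return t1 * t1 + 3.0 * t2 * t2 + t1 * t2


def true_gradient(theta):
    t1, t2 = theta
    return [2.0 * t1 + t2, 6.0 * t2 + t1]


def sp_estimate(theta, delta, c):
    """Two-measurement SP gradient estimate for one perturbation vector."""
    y_plus = loss([t + c * d for t, d in zip(theta, delta)])
    y_minus = loss([t - c * d for t, d in zip(theta, delta)])
    return [(y_plus - y_minus) / (2.0 * c * d) for d in delta]


def mean_sp_estimate(theta, c):
    """Average the SP estimate over all equally likely Bernoulli +/-1 vectors."""
    deltas = list(product((-1.0, 1.0), repeat=len(theta)))
    total = [0.0] * len(theta)
    for delta in deltas:
        for i, g in enumerate(sp_estimate(theta, delta, c)):
            total[i] += g / len(deltas)
    return total


theta = [1.0, 2.0]
print(true_gradient(theta))           # [4.0, 13.0]
print(mean_sp_estimate(theta, 0.1))   # agrees with the line above up to rounding
```

Although there are four sample points, a perturbation and its negative give the same estimate, which is why the illustration shows only two possible search directions.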
Theoretical Basis (Sects. 7.3 - 7.4 of ISSO)

• Under appropriate regularity conditions (e.g., $E(\Delta_{ki}^{-2}) < \infty$, $L(\theta)$
thrice continuously differentiable, $\varepsilon_k^{(\pm)}$ is martingale difference
noise, etc.), we have:

• Near Unbiasedness:

$$E[\hat{g}_k(\hat{\theta}_k) \mid \hat{\theta}_k] = g(\hat{\theta}_k) + O(c_k^2) \text{ a.s.}$$

where $c_k \to 0$

• Convergence:

$$\hat{\theta}_k \to \theta^* \text{ a.s. as } k \to \infty$$

• Asymptotic Normality:

$$k^{\beta/2} (\hat{\theta}_k - \theta^*) \xrightarrow{\text{dist}} N(\mu, \Sigma), \quad 0 < \beta \le 2/3$$

where $\mu$, $\Sigma$, and $\beta$ depend on SA gains, $\Delta_k$ distribution, and
shape of $L(\theta)$
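The order of the bias term in the near-unbiasedness result can be seen on a small example. The Python sketch below uses a hypothetical noise-free cubic loss (an assumption for illustration only) and computes the exact bias under Bernoulli ±1 perturbations by enumeration; for this particular loss the bias works out to c squared in every component.

```python
from itertools import product


def cubic_loss(theta):
    # Hypothetical noise-free loss with a nonzero third derivative.
    return sum(t ** 3 for t in theta)


def cubic_gradient(theta):
    return [3.0 * t * t for t in theta]


def sp_bias(theta, c):
    """Exact bias E[ghat] - g under Bernoulli +/-1 perturbations, by enumeration."""
    p = len(theta)
    deltas = list(product((-1.0, 1.0), repeat=p))
    mean = [0.0] * p
    for delta in deltas:
        y_plus = cubic_loss([t + c * d for t, d in zip(theta, delta)])
        y_minus = cubic_loss([t - c * d for t, d in zip(theta, delta)])
        for i, d in enumerate(delta):
            mean[i] += (y_plus - y_minus) / (2.0 * c * d) / len(deltas)
    return [m - g for m, g in zip(mean, cubic_gradient(theta))]


for c in (0.2, 0.1, 0.05):
    print(c, sp_bias([1.0, 2.0], c))  # each halving of c cuts the bias by about 4
```

This only illustrates the quadratic dependence on the perturbation size; it says nothing about the noise contribution, which grows as the perturbation size shrinks.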
Efficiency Analysis

•  Can  use  asymptotic  normality  to  analyze  relative  efficiency 
of  SPSA  and  FDSA  (Spall,  1992;  Sect.  7.4  of  ISSO ) 

•  Analogous  to  SPSA  asymptotic  normality  result,  FDSA  is 
also  asymptotically  normal  (Chap.  6  of  ISSO) 


The critical cost in comparing relative efficiency
of SPSA and FDSA is number of loss function
measurements $y(\cdot)$, not number of iterations per se


•  Loss  function  measurements  represent  main  cost  (by 
far) — other  costs  are  trivial 


Full  efficiency  story  is  fairly  complex — see  Section  7.4  of 
ISSO  and  references  therein 
Efficiency Analysis (cont’d)

• Will compare SPSA and FDSA by looking at relative mean
square error (MSE) of $\theta$ estimate

• Consider relative MSE for same no. of measurements, n
(not same no. of iterations). Under regularity
conditions above:


as $n \to \infty$

• Equivalently, to achieve same asymptotic MSE

$$\frac{\text{no. meas. } y(\theta) \text{ in SPSA}}{\text{no. meas. } y(\theta) \text{ in FDSA}} \to \frac{1}{p}$$

The two results above are the main theoretical results
Paraphrase of the second result above:


• SPSA and FDSA converge in same number of iterations
despite p-fold savings in cost/iteration for SPSA


—  or  — 


•  One  properly  generated  simultaneous  random  change  of 
all  variables  in  a  problem  contains  as  much  information 
for  optimization  as  a  full  set  of  one-at-a-time  changes 
of  each  variable 
D. PRACTICAL GUIDELINES AND MATLAB CODE

The code below implements SPSA iterations k = 1, 2, ..., n

- Initialization for program variables theta, alpha, etc.
not shown since that can be handled in numerous ways
(e.g., file read, direct inclusion, input during execution)

- $\Delta_k$ elements are generated by Bernoulli ±1

- Program calls external function loss to obtain $y(\theta)$
values

Simple enhancements possible to increase algorithm stability
and/or speed convergence

- Check for simple constraint violation (shown at bottom of
sample code)

- Reject iteration $k \to k+1$ if $y(\hat{\theta}_{k+1})$ is too much greater
than $y(\hat{\theta}_k)$ (requires extra loss measurement per iteration)

- Reject iteration $k \to k+1$ if $\|\hat{\theta}_{k+1} - \hat{\theta}_k\|$ is too large (does
not require extra loss measurement)
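The two rejection rules listed above can be sketched as small helper functions. This Python sketch is hypothetical throughout: the function names, the Euclidean norm, and the thresholds `max_step` and `tolerance` are assumptions, since the slides do not say how "too large" or "too much greater" should be quantified.

```python
import math


def blocked_update(theta, ghat, ak, max_step):
    """Return theta - ak*ghat unless the move is longer than max_step.

    Needs no extra loss measurement. The second return value says whether
    the step was accepted.
    """
    candidate = [t - ak * g for t, g in zip(theta, ghat)]
    step = math.sqrt(sum((n - t) ** 2 for n, t in zip(candidate, theta)))
    if step > max_step:
        return list(theta), False  # reject: keep the current estimate
    return candidate, True


def accept_by_loss(y_new, y_old, tolerance):
    """Accept unless the new noisy loss exceeds the old one by more than tolerance.

    Obtaining y_new costs one extra loss measurement per iteration.
    """
    return y_new <= y_old + tolerance


theta, accepted = blocked_update([0.0, 0.0], [30.0, 40.0], ak=0.1, max_step=1.0)
print(theta, accepted)  # step length is 5.0, so the update is rejected
```

With noisy measurements a positive `tolerance` avoids rejecting steps only because of measurement noise; how large to make it is problem dependent.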
Matlab Code

for k=1:n
    ak=a/(k+A)^alpha;                   % step-size gain a_k
    ck=c/k^gamma;                       % perturbation-size gain c_k
    delta=2*round(rand(p,1))-1;         % Bernoulli +/-1 perturbation vector
    thetaplus=theta+ck*delta;
    thetaminus=theta-ck*delta;
    yplus=loss(thetaplus);              % loss measurement at theta+ck*delta
    yminus=loss(thetaminus);            % loss measurement at theta-ck*delta
    ghat=(yplus-yminus)./(2*ck*delta);  % SP gradient estimate
    theta=theta-ak*ghat;                % SA update
end
theta

If maximum and minimum values on elements of theta can be
specified, say thetamax and thetamin, then two lines can be
added below theta update line to impose constraints:

theta=min(theta,thetamax);
theta=max(theta,thetamin);
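For readers without MATLAB, the same loop can be written as a Python function using only the standard library. This is a hedged port, not the author's code: the function signature, the noisy quadratic test loss, and the gain values in the usage lines are assumptions chosen for illustration.

```python
import random


def spsa(loss, theta, n, a, A, alpha, c, gamma, rng, theta_min=None, theta_max=None):
    """Run n SPSA iterations, mirroring the MATLAB loop with optional box constraints."""
    theta = list(theta)
    p = len(theta)
    for k in range(1, n + 1):
        ak = a / (k + A) ** alpha
        ck = c / k ** gamma
        delta = [rng.choice((-1.0, 1.0)) for _ in range(p)]  # Bernoulli +/-1
        y_plus = loss([t + ck * d for t, d in zip(theta, delta)])
        y_minus = loss([t - ck * d for t, d in zip(theta, delta)])
        ghat = [(y_plus - y_minus) / (2.0 * ck * d) for d in delta]
        theta = [t - ak * g for t, g in zip(theta, ghat)]
        if theta_max is not None:
            theta = [min(t, m) for t, m in zip(theta, theta_max)]
        if theta_min is not None:
            theta = [max(t, m) for t, m in zip(theta, theta_min)]
    return theta


def make_noisy_quadratic(rng, sigma):
    """Hypothetical test loss: sum(theta_i^2) plus N(0, sigma^2) measurement noise."""
    def loss(theta):
        return sum(t * t for t in theta) + rng.gauss(0.0, sigma)
    return loss


rng = random.Random(0)
theta_hat = spsa(make_noisy_quadratic(rng, 0.01), [1.0] * 5, n=2000,
                 a=0.1, A=10, alpha=0.602, c=0.1, gamma=0.101, rng=rng)
print(theta_hat)  # components should be close to the minimizer at 0
```

Passing the random generator in explicitly keeps runs reproducible; the gain constants would need tuning for any real problem.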
E. APPLICATION OF SPSA


•  Numerical  Study:  SPSA  vs.  FDSA 

•  Consider  problem  of  developing  neural  net  controller 
(wastewater  treatment  plant  where  objectives  are  clean  water 
and  methane  gas  production) 

•  Neural  net  is  function  approximator  that  takes  current 
information  about  the  state  of  system  and  produces  control 
action 

• $L_k(\theta)$ = tracking error,

$\theta$ = neural net weights

• Need to estimate $\theta$ in real-time; used nondecaying $a_k = a$, $c_k = c$
due to nonstationary dynamics

• $p = \dim(\theta) = 412$

•  More  information  in  Example  7.4  of  ISSO 
Wastewater Treatment System

[Figure: RMS error for controller in wastewater treatment model. Vertical axis: RMS tracking error $L_k(\theta)$. Horizontal axis: number of measurements (cost of algorithms), with tick marks at 0, 40, 80, 120, 160 for SPSA and at 0, 16,480, 32,960, 49,440, 65,920 for FDSA.]
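The two horizontal scales on the plot are consistent with SPSA using 2 and FDSA using 2p = 824 loss measurements per iteration for p = 412. The Python sketch below reproduces the tick values under the assumption (my reading, not stated in the source) that the ticks mark 0, 20, 40, 60, and 80 iterations; the function name is hypothetical.

```python
def measurements_used(p, iterations, method):
    """Loss measurements consumed after a given number of iterations.

    Assumes two-sided differences: 2 per iteration for SPSA, 2*p for FDSA.
    """
    per_iteration = {"SPSA": 2, "FDSA": 2 * p}[method]
    return per_iteration * iterations


p = 412
for iterations in (0, 20, 40, 60, 80):
    print(iterations,
          measurements_used(p, iterations, "SPSA"),
          measurements_used(p, iterations, "FDSA"))
```

At 80 iterations this gives 160 measurements for SPSA against 65,920 for FDSA, a factor of p = 412.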


F.  ADAPTIVE  SIMULTANEOUS 
PERTURBATION  METHOD 

• Standard SPSA exhibits common “1st-order” behavior

- Sharp initial decline

- Slow convergence in final phase

- Sensitivity to units/scaling for elements of $\theta$


• “2nd-order” form of SPSA exists for speeding convergence,
especially in final phase (analogous to Newton-Raphson)

- Adaptive simultaneous perturbation (ASP) method (details
in Section 7.8 of ISSO)


• ASP based on adaptively estimating Hessian matrix

$$H(\theta) = \frac{\partial^2 L}{\partial \theta \, \partial \theta^T}$$


Addresses  long-standing  problem  of  finding  “easy”  method 
for  Hessian  estimation 


•  Also  has  uses  in  nonoptimization  applications  (e.g.,  Fisher 
information  matrix  in  Subsection  13.3.5  of  ISSO) 
Overview of ASP

• ASP applies in either

(i) Standard SPSA setting where only $L(\theta)$ measurements
are available (as considered earlier) (“2SPSA” algorithm)

- or -

(ii) Stochastic gradient (SG) setting where $L(\theta)$ and $g(\theta)$
measurements are available (“2SG” algorithm)

• Advantages of 2nd-order approach

- Potential for speedier convergence

- Transform invariance (algorithm performance unaffected
by relative magnitude of $\theta$ elements)

• Transform invariance is unique to 2nd-order algorithms

- Allows for arbitrary scaling of $\theta$ elements

- Implies ASP automatically adjusts to chosen units for $\theta$
Cost of Implementation

• For any p, the cost per iteration of ASP is

Four loss measurements for 2SPSA

- or -

Three gradient measurements for 2SG

• Above costs for ASP compare very favorably with previous
methods:

$O(p^2)$ loss measurements ($y(\cdot)$) per iteration in FDSA setting
(e.g., Fabian, 1971)

$O(p)$ gradient measurements per iteration in SG setting
(e.g., Ruppert, 1985)

• If gradient/Hessian averaging or $y(\cdot)$-based iterate blocking is
used, then additional measurements needed per iteration
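To make the "four loss measurements" count concrete, here is a Python sketch of a per-iteration Hessian estimate in the 2SPSA style. The slides do not give the formula, so this follows the published 2SPSA form as I understand it (difference of two one-sided SP gradient estimates, then symmetrization) and should be read as an assumption; the names `delta_tilde` and `c_tilde` and the quadratic test loss are hypothetical. The adaptive averaging of these estimates across iterations is not shown.

```python
from itertools import product


def hessian_estimate(y, theta, delta, delta_tilde, c, c_tilde):
    """Per-iteration Hessian estimate from four loss measurements."""
    p = len(theta)
    plus = [t + c * d for t, d in zip(theta, delta)]
    minus = [t - c * d for t, d in zip(theta, delta)]
    y_plus, y_minus = y(plus), y(minus)
    y_plus_t = y([x + c_tilde * d for x, d in zip(plus, delta_tilde)])
    y_minus_t = y([x + c_tilde * d for x, d in zip(minus, delta_tilde)])
    # One-sided SP gradient estimates at theta +/- c*delta, then their difference.
    g_plus = [(y_plus_t - y_plus) / (c_tilde * d) for d in delta_tilde]
    g_minus = [(y_minus_t - y_minus) / (c_tilde * d) for d in delta_tilde]
    dg = [(gp - gm) / (2.0 * c) for gp, gm in zip(g_plus, g_minus)]
    m = [[dg[i] / delta[j] for j in range(p)] for i in range(p)]
    # Symmetrize, since a Hessian is symmetric.
    return [[0.5 * (m[i][j] + m[j][i]) for j in range(p)] for i in range(p)]


def mean_hessian_estimate(y, theta, c, c_tilde):
    """Average the estimate over all Bernoulli +/-1 pairs (delta, delta_tilde)."""
    p = len(theta)
    signs = list(product((-1.0, 1.0), repeat=p))
    mean = [[0.0] * p for _ in range(p)]
    for delta in signs:
        for delta_tilde in signs:
            h = hessian_estimate(y, theta, delta, delta_tilde, c, c_tilde)
            for i in range(p):
                for j in range(p):
                    mean[i][j] += h[i][j] / len(signs) ** 2
    return mean


def quadratic(theta):
    # Hypothetical noise-free loss with Hessian [[2, 1], [1, 6]].
    t1, t2 = theta
    return t1 * t1 + 3.0 * t2 * t2 + t1 * t2


print(mean_hessian_estimate(quadratic, [1.0, 2.0], 0.1, 0.1))
```

For this quadratic the average over all perturbation pairs recovers the true Hessian up to rounding, while any single estimate is very noisy, which is why the method averages the estimates over iterations.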
Efficiency Analysis for ASP

• Can use asymptotic normality of 2SPSA and 2SG to
compare asymptotic RMS errors (as in basic SPSA) against
best possible asymptotic RMS of SPSA and SG, say
$\text{RMS}^*_{\text{SPSA}}$ and $\text{RMS}^*_{\text{SG}}$

• 2SPSA: With $a_k = 1/k$ and $c_k = c/k^{1/6}$ ($k \ge 1$)

$$\frac{\text{RMS of 2SPSA}}{\text{RMS}^*_{\text{SPSA}}} < 2 \quad \forall\, c > 0$$

• 2SG: With $a_k = 1/k$ and any valid $c_k$

$$\frac{\text{RMS of 2SG}}{\text{RMS}^*_{\text{SG}}} = 1$$

• Interpretation: 2SPSA (with $a_k = 1/k$) does almost as well as
unobtainable best SPSA; RMS error differs by < factor of 2

• 2SG (with $a_k = 1/k$) does as well as the analytically optimal
SG (rarely available)
G. EXTENSIONS AND FURTHER RESULTS


•  There  are  variations  and  enhancements  to  “standard” 
SPSA  of  Section  B 

•  Section  7.7  of  ISSO  discusses: 

(i)  Enhanced  convergence  through  gradient 
averaging/smoothing 

(ii)  Constrained  optimization 

(iii)  Optimal  choice  of  distribution 

(iv)  One-measurement  form  of  SPSA 

(v)  Global  optimization 

(vi)  Noncontinuous  (discrete)  optimization 
(i) Gradient Averaging and Gradient Smoothing

• These approaches may yield improved convergence in some
cases

• In gradient averaging $\hat{g}_k(\hat{\theta}_k)$ is simply replaced by the
average of several (say, q) SP gradient estimates

- This approach uses 2q values of $y(\cdot)$ per iteration

- Spall (1992) establishes theoretical conditions for when this
is advantageous, i.e., when lower MSE compensates for
greater per-iteration cost (2q vs. 2, q > 1)

-  Essentially,  beneficial  in  a  high-noise  environment 
(consistent  with  intuition!) 

•  In  gradient  smoothing,  gradient  estimates  averaged  across 
iterations  according  to  scheme  that  carefully  balances  past 
estimates  with  current  estimate 

-  Analogous  to  “momentum”  in  neural  net/backpropagation 
literature 
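Gradient averaging as described above can be sketched in a few lines of Python. The function name and the use of independent Bernoulli ±1 perturbations for each of the q estimates are assumptions made for illustration; gradient smoothing across iterations is not shown because the slides do not specify the weighting scheme.

```python
import random


def averaged_sp_gradient(y, theta, c, q, rng):
    """Average q independent SP gradient estimates (2*q calls to y)."""
    p = len(theta)
    total = [0.0] * p
    for _ in range(q):
        delta = [rng.choice((-1.0, 1.0)) for _ in range(p)]  # fresh perturbation
        y_plus = y([t + c * d for t, d in zip(theta, delta)])
        y_minus = y([t - c * d for t, d in zip(theta, delta)])
        for i, d in enumerate(delta):
            total[i] += (y_plus - y_minus) / (2.0 * c * d) / q
    return total


def noisy_loss_factory(rng, sigma):
    """Hypothetical loss: sum(theta_i^2) plus N(0, sigma^2) measurement noise."""
    def loss(theta):
        return sum(t * t for t in theta) + rng.gauss(0.0, sigma)
    return loss


rng = random.Random(0)
ghat = averaged_sp_gradient(noisy_loss_factory(rng, 0.5), [1.0, -1.0, 2.0], 0.1, q=8, rng=rng)
print(ghat)  # one averaged estimate built from 16 loss measurements
```

Whether the reduced variance is worth the q-fold increase in measurements depends on the noise level, as the slide notes.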
(ii) Constrained Optimization

• Most practical problems involve constraints on $\theta$

• Numerous possible ways to treat constraints (simple
constraints discussed in Section D)

• One approach based on projections (exploits well-known
Kuhn-Tucker framework)

• Projection approach keeps $\hat{\theta}_k$ and $\hat{\theta}_k \pm c_k \Delta_k$ in valid region
for all k by projecting $\hat{\theta}_k$ into a region interior to the valid
region

- Desirable in real systems to keep $\hat{\theta}_k \pm c_k \Delta_k$ (in addition
to $\hat{\theta}_k$) inside valid region to ensure physically achievable
solution while iterating

•  Penalty  functions  are  general  approach  that  may  be  easier 
to  use  than  projections 

-  However,  penalty  functions  require  care  for  efficient 
implementation 
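For the simplest case of box constraints, the idea of projecting into a region interior to the valid region can be sketched as below. This Python sketch assumes elementwise bounds and Bernoulli ±1 perturbations, so that shrinking the box by the current perturbation size on every side is enough; the function name is hypothetical, and more general constraint sets need the fuller treatment referenced in the slides.

```python
from itertools import product


def project_interior(theta, theta_min, theta_max, ck):
    """Clip theta into [theta_min + ck, theta_max - ck].

    With every |delta_i| = 1, this keeps theta +/- ck*delta inside
    [theta_min, theta_max].
    """
    projected = []
    for t, lo, hi in zip(theta, theta_min, theta_max):
        if hi - lo < 2.0 * ck:
            raise ValueError("box is narrower than the perturbation width 2*ck")
        projected.append(min(max(t, lo + ck), hi - ck))
    return projected


ck = 0.2
theta = project_interior([1.5, -3.0, 0.1], [-1.0] * 3, [1.0] * 3, ck)
print(theta)
# Every perturbed point now lies inside the original box (up to rounding).
for delta in product((-1.0, 1.0), repeat=3):
    point = [t + ck * d for t, d in zip(theta, delta)]
    assert all(-1.0 - 1e-12 <= x <= 1.0 + 1e-12 for x in point)
```

Because the perturbation size typically decays with k, the shrunken box approaches the full valid region as the iterations proceed.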
(iii) Optimal Choice of Distribution

• Sections 7.3 and 7.4 of ISSO discuss sufficient conditions
for $\Delta_k$ distribution (see also Sections B and C here)

-  These  conditions  guide  user  since  user  typically  has  full 
control  over  distribution 

-  Uniform  and  normal  distributions  do  not  satisfy  conditions 

•  Asymptotic  distribution  theory  shows  that  symmetric 
Bernoulli  distribution  is  asymptotically  optimal 

-  Optimal  in  both  an  MSE  and  nearness-probability  sense 

-  Symmetric  Bernoulli  is  trivial  to  generate  by  Monte  Carlo 

•  Symmetric  Bernoulli  seems  optimal  in  many  practical  (finite- 
sample)  problems 

-  One  exception  mentioned  in  Section  7.7  of  ISSO  (robot 
control  problem):  segmented  uniform  distribution 
(iv) One-Measurement SPSA

• Standard SPSA uses two loss function measurements/iteration

• One-measurement SPSA based on gradient approximation

$$\hat{g}_k(\hat{\theta}_k) = \begin{bmatrix} \dfrac{y(\hat{\theta}_k + c_k \Delta_k)}{c_k \Delta_{k1}} \\ \vdots \\ \dfrac{y(\hat{\theta}_k + c_k \Delta_k)}{c_k \Delta_{kp}} \end{bmatrix}$$

• As with two-measurement SPSA this form is unbiased
estimate of $g(\hat{\theta}_k)$ to within $O(c_k^2)$

• Theory shows standard two-measurement form generally
preferable in terms of total measurements needed for effective
convergence

- However, in some settings, one-measurement form is
preferable

- One such setting: control problems with significant
nonstationarities
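The one-measurement form can be checked the same way as the two-measurement form, by averaging over all Bernoulli ±1 perturbations. The Python sketch below assumes a hypothetical noise-free quadratic loss, for which the average reproduces the gradient up to rounding; the function names are illustrative only.

```python
from itertools import product


def loss(theta):
    # Hypothetical noise-free quadratic loss with gradient [2*t1 + t2, 6*t2 + t1].
    t1, t2 = theta
    return t1 * t1 + 3.0 * t2 * t2 + t1 * t2


def one_measurement_gradient(y, theta, delta, c):
    """One-measurement SP estimate: a single call to y per gradient estimate."""
    y_plus = y([t + c * d for t, d in zip(theta, delta)])
    return [y_plus / (c * d) for d in delta]


def mean_one_measurement(y, theta, c):
    """Average the estimate over all equally likely Bernoulli +/-1 vectors."""
    deltas = list(product((-1.0, 1.0), repeat=len(theta)))
    total = [0.0] * len(theta)
    for delta in deltas:
        for i, g in enumerate(one_measurement_gradient(y, theta, delta, c)):
            total[i] += g / len(deltas)
    return total


theta = [1.0, 2.0]
print(mean_one_measurement(loss, theta, 0.1))               # close to [4.0, 13.0]
print(one_measurement_gradient(loss, theta, (1.0, 1.0), 0.1))  # a single, much larger estimate
```

In this example the individual one-measurement estimates are an order of magnitude larger than the gradient they average to, which is in line with the slide's remark that the two-measurement form is generally preferable.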
(v) Global Optimization

• SPSA has demonstrated significant effectiveness in global
optimization where there may be multiple (local) minima

• One approach is to inject Gaussian noise to right-hand
side of standard SPSA recursion:

$$\hat{\theta}_{k+1} = \hat{\theta}_k - a_k \hat{g}_k(\hat{\theta}_k) + b_k w_k \qquad (*)$$

where $b_k \to 0$ and $w_k \sim N(0, I_{p \times p})$

• Injected noise $w_k$ generated by Monte Carlo

• Eqn. (*) has theoretical basis for formal convergence
(Section 8.4 of ISSO)
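One step of recursion (*) can be written as a small Python function. The function name is hypothetical, and no particular schedule for the noise gain is implied; the slides only require that it decay to zero.

```python
import random


def global_spsa_step(theta, ghat, ak, bk, rng):
    """One update theta - ak*ghat + bk*w with w drawn from N(0, I)."""
    return [t - ak * g + bk * rng.gauss(0.0, 1.0) for t, g in zip(theta, ghat)]


theta = [1.0, -2.0]
ghat = [0.5, 0.5]
print(global_spsa_step(theta, ghat, ak=0.1, bk=0.0, rng=random.Random(0)))   # standard SPSA step
print(global_spsa_step(theta, ghat, ak=0.1, bk=0.05, rng=random.Random(0)))  # step with injected noise
```

Setting the noise gain to zero recovers the standard SPSA update.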
(v) Global Optimization (Cont’d)

•  Recent  results  show  that  bk  =  0  is  sufficient  for  global 
convergence  in  many  cases  (Section  8.4  of  ISSO) 

-  No  injected  noise  needed  for  global  convergence 

-  Implies  standard  SPSA  is  global  optimizer  under 
appropriate  conditions 

• Numerical demos on some tough global problems with
many local minima yield global solution

-  Neither  genetic  algorithms  nor  simulated  annealing  able  to 
find  global  minima  in  test  suite 

-  No  guarantee  of  analogous  relative  behavior  on  other 
problems 

•  Regularity  conditions  for  global  convergence  of  SPSA 
difficult  to  check 


(vi) Noncontinuous (Discrete) Optimization

• Basic SPSA framework for $L(\theta)$ differentiable in $\theta$

• Many important problems have elements in $\theta$ taking only
discrete (e.g., integer) values

• There have been extensions to SPSA to allow for discrete $\theta$

- Brief discussion in Section 7.7 of ISSO; see also references
at SPSA Web site

• SP estimate $\hat{g}_k(\hat{\theta}_k)$ produces descent information although
gradient not defined

• Key issue in implementation is to control iterations $\hat{\theta}_k$ and
perturbations $\hat{\theta}_k \pm c_k \Delta_k$ to ensure they are valid $\theta$ values
Contact and Other Information

Contact: James C. Spall

james.spall@jhuapl.edu
240-228-4960

SPSA web site

www.jhuapl.edu/SPSA

Additional relevant information at site for related book
Introduction to Stochastic Search and Optimization

www.jhuapl.edu/ISSO

Tutorial  paper  (available  at  SPSA  web  site): 

Spall,  J.  C.  (1998),  “An  Overview  of  the  Simultaneous 
Perturbation  Method  for  Efficient  Optimization,”  Johns 
Hopkins  APL  Technical  Digest,  vol.  19,  pp.  482-492. 


